MIR rewrite#9299
Open
rtfeldman wants to merge 1289 commits into
Open
Conversation
Comment on lines
-23
to
-32
| /// Comptime-gated tracing for the shared lowering pipeline. | ||
| const trace = struct { | ||
| const enabled = if (@hasDecl(build_options, "trace_eval")) build_options.trace_eval else false; | ||
|
|
||
| fn log(comptime fmt: []const u8, args: anytype) void { | ||
| if (comptime enabled) { | ||
| std.debug.print("[lower] " ++ fmt ++ "\n", args); | ||
| } | ||
| } | ||
| }; |
Collaborator
There was a problem hiding this comment.
I found this tracing helpful for tracking down bugs, do we want to remove it?
43b6417 to
e5bb427
Compare
…loop # Conflicts: # src/cli/repl.zig # src/repl/eval.zig # src/repl/repl_test.zig # test/snapshots/repl/anno_only_crash.md
The semantic audit and post-check architecture scripts are perl, which isn't preinstalled on Windows. Print a skip notice instead of failing; Linux/macOS CI still enforces these gates. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ValueStorage(...) is a generic factory that is never instantiated. Only its ValueLoc and NumKind enums are consumed, and the per-arch backends in x86_64/CodeGen.zig and aarch64/CodeGen.zig duplicate the same bitmask-freelist allocator logic directly. Document the two paths forward (delete the factory and move enums next to consumers, vs. finish wiring both backends through this layer) so the next person touching it knows it is stalled scaffolding. The i128 multiply register-clobber fix that originally accompanied this TODO landed independently upstream as 81bdb69 and was absorbed by the rebase. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Windows previously fell through to runSequential because fork() doesn't exist. Add a parallel executor that re-invokes the runner binary with `--worker <idx>` for each test, reads the serialized result from stdout, and runs N such children concurrently. JobObject with KILL_ON_JOB_CLOSE ensures workers tear down with the parent. The eval runner's runSingleTest is reused as-is in worker mode. On Phase-1 failures, the runner re-spawns the failing test once per implemented backend with `--worker-backend <name>` so it can attribute which backend crashed (POSIX gets this for free via forkAndEval). A new `onTestStarted` callback moves the per-test "RUN" log line into the parent so it stays coherent across N workers. POSIX uses the same callback from launchChild — strictly an improvement over the previous print-from-forked-child which could interleave. Default workers on Windows capped at min(cpu_count, 4) because each worker reserves ~2TB of virtual address space for the LIR shared-memory image; more concurrent workers trip MapViewOfFileFailed. Users can override with --threads N. Verified on Windows with the full eval suite: 1200/1201 pass in 75s using 4 processes (vs 124s sequential on inspect: subset). The one failure is a transient MapViewOfFileFailed in Phase 1 where Phase-2 attribution correctly shows every backend passing in isolation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Each parallel worker reserved 2 TB of virtual address space for the LIR runtime image. That reservation was the single biggest contributor to per-test wall-clock — instrumentation showed ~60 ms of runSingleTest's ~195 ms going into CreateFileMapping + MapViewOfFile alone, and the system-wide accounting trips MapViewOfFileFailed when N parallel workers each reserve 2 TB simultaneously, which was why we capped default workers at 4. Drop the Windows reservation to 256 MB (large enough for any eval test; the runtime image actually used is a few MB). Remove the 4-worker cap so default parallelism follows cpu_count. Also adds a ROC_EVAL_TIME_WORKER env var that dumps per-phase timestamps from inside a worker — used to find the 2 TB reservation as the bottleneck. Costs nothing when unset. Full suite (1201 tests) on Windows: Original parallelization (4 workers, 2 TB): 75 s, 1200/1201 pass After reservation shrink (4 workers, 256 MB): 48 s, 1201/1201 pass After lifting cap (16 workers, 256 MB): 27 s, 1201/1201 pass Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the one-Child-per-test executor with a persistent worker pool: each Child spawns once in --worker-stream mode and processes many tests over a stdin/stdout protocol. Parent sends "<idx>\n" per test, child writes a u32 length + serialized result to stdout. Child exits cleanly on stdin EOF. This amortizes process boot (CreateProcess + main + collectTests + filter + builtin loading) across all tests assigned to a worker. The one-shot --worker <N> entry point is kept for the runner's Phase-2 retry path, which needs to vary --worker-backend per spawn and benefits less from amortization. Full eval suite on Windows (cpu_count=16, 256 MB shm reservation): one-shot, --threads 4 83s persistent, --threads 4 74s -11% one-shot, --threads 8 55s persistent, --threads 8 45s -18% one-shot, --threads 16 44s persistent, --threads 16 35s -20% Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Parallelize eval tests on Windows
Zig installs its default segfault handler via AddVectoredExceptionHandler,
which on Windows runs before SetUnhandledExceptionFilter and short-circuits
our handler in src/base/stack_overflow.zig. The previously failing
stack_overflow test exited with code 5 ("Stack Overflow" stderr from Zig's
handler followed by a secondary access violation while dumping the stack)
instead of our intended 134 with the helpful message.
Disable enable_segfault_handler in the test helper unconditionally and in
the roc CLI on Windows only, preserving Zig's dev-friendly POSIX traces.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The parallel eval test runner printed "RUN <name>" for every test as it started, which is noisy under the default 16-worker fan-out. Move it behind --verbose so it only fires when the user opts in. `onTestStarted` is a comptime Pool callback and can't take a closure, so the flag is stored in a module-level `verbose_logging` populated from `cli.verbose` in `main` after arg parsing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The test harness's persistent-worker pool unconditionally appends `--worker-stream` to every child it spawns (src/build/test_harness.zig runChildPool). The eval runner already handled the flag, but parallel_cli_runner did not: workers parsed the flag, ignored it, and fell through to the parent path, where they reentrantly spawned their own pool of N child workers — each launched with `--worker-stream` again. The resulting fork-bomb made every worker run another `roc build`, and `roc build` reserves a 2 TB SEC_RESERVE on Windows (src/cli/main.zig SHARED_MEMORY_SIZE). The cascading reservations exhausted the system commit limit, which is why sibling test suites running in parallel (roc_subcommands_test, glue_test) failed with CreateProcessW => COMMITMENT_LIMIT / SystemResources and OutOfMemory. Add the streaming worker branch: read decimal test indices from stdin (one per line), run each test, write a u32-length-prefixed result to stdout, loop until EOF. Mirrors src/eval/test/parallel_runner.zig. Before: test-platforms reported 0 passed / 91 crashed / 16 hung in 62s on Windows, and test-subcommands / test-glue failed with COMMITMENT_LIMIT and OutOfMemory. After: test-platforms is 91 passed / 16 failed in 6.5s (the 16 are real test errors that were masked by the fork-bomb), test-subcommands and test-glue pass cleanly. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Wouldn't be a proper rewrite without a rewrite-rewrite.
Fixes #9388
Fixes #9389
Fixes #9390
Fixes #9391
Fixes #9392
Fixes #9393
Fixes #9395
Fixes #9396